ABSTRACT

We show that the galaxy density in the Las Campanas Redshift Survey (LCRS) cannot be perfectly 
correlated with the underlying mass distribution since various galaxy subpopulations are not perfectly 
correlated with each other, even taking shot noise into account. This rules out the hypothesis of simple 
linear biasing, and suggests that the recently proposed stochastic biasing framework is necessary for 
modeling actual data. 

Subject headings: galaxies: statistics — large-scale structure of universe 



1. INTRODUCTION 

Measurements of clustering in upcoming galaxy redshift 
surveys hold the potential of measuring cosmological pa- 
rameters with great accuracy (Tegmark 1997; Goldberg & 
Strauss 1997), especially when complemented by measure- 
ments of the Cosmic Microwave Background (Eisenstein 
et al. 1998; Hu et al. 1998ab). Since such measurements 
are only as accurate as our understanding of biasing, there 
has been a recent burst of work on the relation between 
the distribution of galaxies and the underlying mass. 

Dekel and Lahav (1998; Dekel 1997 §5.5; Lahav 1996 
§3.1) have proposed a robust framework termed "stochastic 
biasing", which drops the assumption that the galaxy 
density $\rho_g(\mathbf{r})$ is uniquely determined by the matter 
density $\rho$. Writing the corresponding density fluctuations as 
$g \equiv \rho_g/\langle\rho_g\rangle - 1$ and $\delta \equiv \rho/\langle\rho\rangle - 1$, 
$g$ is modelled as a function of $\delta$ plus a random term. It has 
been known since the outset (e.g., Dressler 1980) that any 
deterministic biasing relation $g = f(\delta)$ must be complicated, 
depending on galaxy type. Even allowing this, however, 
deterministic biasing still implies that the peaks and troughs of the 
two fields must coincide spatially, which need not be the 
case. Restricting attention to second moments, all the 
information about both such stochasticity and nonlinearity 
(of $f(\delta)$) can be contained in a single new function $r$ (Pen 
1998; Tegmark & Peebles 1998, hereafter TP98). Grouping 
the densities into a two-dimensional vector 

$$\mathbf{x} \equiv \begin{pmatrix} \delta \\ g \end{pmatrix} \qquad (1)$$



and assuming nothing except translational invariance, its 
Fourier transform $\hat{\mathbf{x}}(\mathbf{k}) \equiv \int e^{-i\mathbf{k}\cdot\mathbf{r}}\,\mathbf{x}(\mathbf{r})\,d^3r$ obeys 

$$\langle \hat{\mathbf{x}}(\mathbf{k})\,\hat{\mathbf{x}}(\mathbf{k}')^\dagger \rangle = (2\pi)^3 \delta_D(\mathbf{k}-\mathbf{k}') \begin{pmatrix} P(k) & P_\times(k) \\ P_\times(k) & P_g(k) \end{pmatrix} \qquad (2)$$

for some 2x2 power spectrum matrix that we will denote 
$\mathbf{P}(k)$. Here $P$ is the conventional power spectrum 
of the mass distribution, $P_g$ is the power spectrum of the 
galaxies, and $P_\times$ is the cross spectrum. It is convenient to 
rewrite this covariance matrix as 



$$\mathbf{P}(k) = P(k) \begin{pmatrix} 1 & b(k)\,r(k) \\ b(k)\,r(k) & b(k)^2 \end{pmatrix} \qquad (3)$$



where $b \equiv (P_g/P)^{1/2}$ is the bias factor (the ratio of 
luminous and total fluctuations) and the new function 
$r \equiv P_\times/(P P_g)^{1/2}$ is the dimensionless correlation 
coefficient between galaxies and matter. The special case $r = 1$ 
gives the simple deterministic biasing relation $g = b\delta$; 
however, both $b$ and $r$ may generally depend on scale. 
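As a minimal numerical sketch of these definitions, the bias factor and the correlation coefficient at a single wavenumber follow directly from the three spectra. The function name and the input values below are illustrative assumptions, not survey measurements.

```python
import math

def bias_and_correlation(p_mass, p_gal, p_cross):
    """Return (b, r) at one wavenumber from P, P_g and the cross spectrum."""
    if p_mass <= 0 or p_gal <= 0:
        raise ValueError("auto power spectra must be positive")
    b = math.sqrt(p_gal / p_mass)               # b = (P_g / P)^(1/2)
    r = p_cross / math.sqrt(p_mass * p_gal)     # r = P_x / (P P_g)^(1/2)
    return b, r

# Hypothetical spectra, chosen only for illustration.
b, r = bias_and_correlation(p_mass=100.0, p_gal=400.0, p_cross=160.0)
print(f"b = {b:.2f}, r = {r:.2f}")
```

With these toy inputs the cross spectrum is smaller than $(P P_g)^{1/2}$, so $r$ falls below unity even though $b$ alone would suggest a simple rescaling of the matter field.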

Since the function r(k) is a cosmologically important 
quantity, it has received much recent attention. Pen (1998) 
has shown how it can be measured using redshift space dis- 
tortions and nonlinear effects, Scherrer & Weinberg (1998) 
have computed r for a number of theoretical models, and 
Blanton et al. 1998 (hereafter B98) have estimated b and r 
from hydrodynamic simulations. TP98 have computed the 
time-evolution of bias in the linear regime, while Taruya 
et al. (1998ab) have generalized this result to the pertur- 
batively non-linear case (see also Mo & White 1996; Mo 
et al. 1997; Matarrese et al. 1997; Bagla 1998; Catelan 
1998ab; Colin et al. 1998; Lemsen & Sheth 1998; Moscar- 
dini et al. 1998; Porciani 1998; Wechsler et al. 1998). Some 
of these numerical and theoretical predictions have been 
borne out in observational data which indicate very high 
bias values ($b \approx 4 - 6$) at $z \sim 3$, decreasing rapidly with 
time (Giavalisco et al. 1998). 

In light of all this activity, it would be timely to ob- 
servationally measure r(k). This is the topic of the 
present paper. Unfortunately, measuring $r$ directly 
requires knowledge of the true matter distribution $\delta$. 
Although this information in principle may be obtained from 
e.g. POTENT reconstruction from peculiar velocity 
measurements (Dekel et al. 1990), the quasilinear redshift-space 
method of Pen (1998) or gravitational lensing (Van 
Waerbeke 1998), it will likely require better data than is 
presently available since systematic errors in the estimate 
of $\delta$ masquerade as $r < 1$. We therefore adopt an indirect 
approach, based on the following simple idea. If two different 
types of galaxies are both perfectly correlated with the 
matter, then they must also be perfectly correlated with 
each other. If we can demonstrate imperfect correlations 
between two galaxy subsamples then we will have shown 
that $r < 1$ for at least one of them. We show that $r < 1$ at 
high statistical significance, and place quantitative limits 
on r in §3. 
into 6 types or "clans" according to the spectral classification 
scheme of Bromley et al. (1998ab), ranging from 
very early-type galaxies (clan 1) to late-type objects (clan 
6). To reduce shot noise we define our clan 4 as the com- 
bination of the original clans 4, 5 and 6. The clustering 
properties of these clans vary in a systematic way, reveal- 
ing a progression in the relative bias factors b - see also 
Santiago & Strauss (1992). B98 predict that r depends 
strongly on the formation epoch of a galaxy population, 
so one might suspect that some of the LCRS clans have 
r < 1. 
FIG. 1. The four subsamples (clans) of the LCRS. 

The LCRS consists of $n_f = 327$ rectangular fields in the 
sky, and we further subdivide the volume into $n_s = 10$ 
radial shells in the range 10 000 km/s < cz < 45 000 km/s. 
Discarding galaxies outside of this redshift range (as in Lin 
et al. 1996) leaves 519, 8282, 5152 and 5669 galaxies in the 
four clans. For each of our $n = n_f \times n_s$ spatial volumes 
$V_\alpha$ ($\alpha = 1,...,n$) and each clan $i = 1,...,n_c$, we count 
the number of observed galaxies $N_\alpha^{(i)}$, and compute the 
expected number of galaxies $\bar{N}_\alpha^{(i)}$ using the selection 
function of Lin et al. (1996) with the clan-dependent Schechter 
parameters of Bromley et al. (1998a). We write the map 
of observed density fluctuations for the $i$th clan as an 
$n$-dimensional vector $\mathbf{g}^{(i)}$, defined by 
$g_\alpha^{(i)} \equiv N_\alpha^{(i)}/\bar{N}_\alpha^{(i)} - 1$. 
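The construction of the fluctuation map can be sketched in a few lines. The helper name and the toy counts are assumptions made for illustration; the real expected counts come from the selection function cited above.

```python
def density_fluctuations(observed, expected):
    """Return g_alpha = N_alpha / Nbar_alpha - 1 for one clan, cell by cell."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected counts must have equal length")
    return [n_obs / n_exp - 1.0 for n_obs, n_exp in zip(observed, expected)]

# Number of spatial volumes: 327 fields times 10 radial shells.
n_cells = 327 * 10

# Toy counts for three cells (hypothetical values).
g = density_fluctuations(observed=[3, 0, 5], expected=[2.0, 1.0, 4.0])
print(n_cells, g)
```

An empty cell gives $g_\alpha = -1$, the lowest value the fluctuation can take.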
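In a simple $r = 1$ linear biasing model, we would have 

$$g_\alpha^{(i)} = b_i\,\delta_\alpha + \varepsilon_\alpha^{(i)} \qquad (4)$$

where $b_i$ is the bias of the $i$th clan, $\delta_\alpha$ is the matter density 
fluctuation in $V_\alpha$ and the shot noise contributions $\varepsilon_\alpha^{(i)}$ have 
zero mean and a diagonal covariance matrix 

$$N^{(i)}_{\alpha\beta} \equiv \langle \varepsilon_\alpha^{(i)} \varepsilon_\beta^{(i)} \rangle = \delta_{\alpha\beta}/\bar{N}_\alpha^{(i)}. \qquad (5)$$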

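for different values of the factor $f$. If $f = b_i/b_j$, then 
equation (4) shows that the (unknown) matter density 
fluctuations $\delta_\alpha$ will cancel out, and $\Delta\mathbf{g}$ will consist of mere 
shot noise whose covariance matrix is 
$\mathbf{N} \equiv \langle \Delta\mathbf{g}\,\Delta\mathbf{g}^t \rangle = \mathbf{N}^{(i)} + f^2\,\mathbf{N}^{(j)}$. 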
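The cancellation can be written out explicitly. The sketch below takes the difference map $\Delta\mathbf{g} = \mathbf{g}^{(i)} - f\,\mathbf{g}^{(j)}$ of equation (6) together with the linear model of equation (4), and assumes that the shot noise of the two clans is mutually uncorrelated, which is what the quoted noise covariance implies.

```latex
\begin{aligned}
\Delta g_\alpha &= g_\alpha^{(i)} - f\,g_\alpha^{(j)}
  = (b_i - f\,b_j)\,\delta_\alpha + \varepsilon_\alpha^{(i)} - f\,\varepsilon_\alpha^{(j)}, \\
f = b_i/b_j \;&\Rightarrow\;
\Delta g_\alpha = \varepsilon_\alpha^{(i)} - f\,\varepsilon_\alpha^{(j)}, \qquad
\langle \Delta g_\alpha\,\Delta g_\beta \rangle
  = N^{(i)}_{\alpha\beta} + f^2\,N^{(j)}_{\alpha\beta}.
\end{aligned}
```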

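Given the alternative hypothesis that there is a residual 
signal with some covariance matrix $\mathbf{S}$, so that 
$\langle \Delta\mathbf{g}\,\Delta\mathbf{g}^t \rangle = \mathbf{N} + \mathbf{S}$, 
the most powerful "null-buster" test for ruling out 
the null hypothesis $\langle \Delta\mathbf{g}\,\Delta\mathbf{g}^t \rangle = \mathbf{N}$ uses the generalized 
$\chi^2$-statistic (Tegmark 1998) 

$$\nu \equiv \frac{\Delta\mathbf{g}^t \mathbf{N}^{-1}\mathbf{S}\mathbf{N}^{-1}\Delta\mathbf{g} - \mathrm{tr}\,\mathbf{N}^{-1}\mathbf{S}}{\left[2\,\mathrm{tr}\,\{\mathbf{N}^{-1}\mathbf{S}\mathbf{N}^{-1}\mathbf{S}\}\right]^{1/2}} \qquad (7)$$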



which can be interpreted as the number of "sigmas" at 
which the noise-only null hypothesis is ruled out. We 
choose $S_{\alpha\beta} = \xi(|\mathbf{r}_\alpha - \mathbf{r}_\beta|)$, where $\mathbf{r}_\alpha$ is the center of 
volume $V_\alpha$ and $\xi$ is the correlation function measured by the 
LCRS (Tucker et al. 1997). We plot the results in Figure 
2 for three pairs of clans $i$ and $j$, and the corresponding 
valley-shaped curve tells us a number of things. The 
fact that $\nu \gg 1$ on the left-hand-side (as $f \to 0$, with 
all the weight on clan $i$) means that there is a strong 
detection of cosmological fluctuations above the shot noise 
level ($\nu \sim 1$). Likewise, $\nu \gg 1$ on the right-hand-side (as 
$f \to \infty$), which demonstrates cosmological signal in clan 
$j$. The fact that the curve dips for intermediate $f$-values 
tells us that the two density maps are correlated ($r > 0$) 
and have common signal. The minimum is attained at the 
value of $f$ which gives the best fit relative bias $b_i/b_j$ for this 
common signal. However, the fact that $\nu \gg 1$ even at the 
minimum proves that even though some signal is shared 
in common, not all of it is: there are no values of $b_i$ for 
which equation (4) can hold for any pair of clans $i$ and $j$. 
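A minimal sketch of the statistic in equation (7) is given below, assuming a diagonal noise covariance (as for shot noise) so that no matrix inversion is needed. The function name and the two-cell inputs are hypothetical; the signal matrix here is a toy identity matrix, not the LCRS correlation function.

```python
import math

def null_buster(dg, noise_var, signal):
    """Generalized chi-squared statistic nu for a diagonal noise covariance.

    dg        -- difference map, list of n floats
    noise_var -- diagonal of the noise covariance N, list of n floats
    signal    -- assumed signal covariance S, n x n list of lists
    """
    n = len(dg)
    w = [dg[a] / noise_var[a] for a in range(n)]            # N^{-1} dg
    quad = sum(w[a] * signal[a][b] * w[b]
               for a in range(n) for b in range(n))          # dg^t N^-1 S N^-1 dg
    trace_ns = sum(signal[a][a] / noise_var[a] for a in range(n))
    trace_nsns = sum(signal[a][b] * signal[b][a] / (noise_var[a] * noise_var[b])
                     for a in range(n) for b in range(n))    # tr{N^-1 S N^-1 S}
    return (quad - trace_ns) / math.sqrt(2.0 * trace_nsns)

# Toy two-cell example with unit noise and an identity signal matrix.
identity = [[1.0, 0.0], [0.0, 1.0]]
nu_excess = null_buster([3.0, 0.0], [1.0, 1.0], identity)   # map with excess power
nu_typical = null_buster([1.0, 1.0], [1.0, 1.0], identity)  # map at the noise level
print(nu_excess, nu_typical)
```

In this toy case a map whose variance matches the noise gives $\nu = 0$, while excess power along the assumed signal raises $\nu$. Scanning such a function over $f$ would trace out a curve of the kind described above.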

Note that Figure 2 does not directly tell us which clans 
are most correlated, since low minima can signal either 
strong correlations, low fluctuations or high shot noise, 
the latter being the case for the (rare) 1st clan. 
FIG. 2. Significance at which we can rule out that different clan 
pairs are perfectly correlated (that $\mathbf{g}^{(i)} - f\,\mathbf{g}^{(j)}$ is pure shot noise). 



Can we rule this out? For a pair of clans $i$ and $j$, consider 
the difference map 

$$\Delta\mathbf{g} \equiv \mathbf{g}^{(i)} - f\,\mathbf{g}^{(j)} \qquad (6)$$



3. MEASURING r(k) 
Having ruled out non-stochastic linear biasing, i.e., 
demonstrated that r < 1 for some clans, let us now study 
in more detail what constraints can be placed on r. 
3.1. Upper limits on r 

Let us lengthen the 2-dimensional vector $\mathbf{x} = (\delta, g)$ from 
equation (1) to a $(1 + n_c)$-dimensional vector including 
all $n_c$ galaxy clans ($n_c = 4$): $x_0 \equiv \delta$ and $x_i \equiv g^{(i)}$, 
$i = 1,...,n_c$, i.e., $\mathbf{x} = (\delta, g^{(1)}, g^{(2)}, g^{(3)}, g^{(4)})$. Next we 
factor the $(1 + n_c) \times (1 + n_c)$ covariance matrix $\mathbf{P} \equiv \langle \mathbf{x}\mathbf{x}^t \rangle$ 
as $P_{ij} = \sigma_i \sigma_j R_{ij}$, where $\sigma_i \equiv P_{ii}^{1/2}$ is the rms 
fluctuation of the $i$th component and $R_{ij} \equiv P_{ij}/\sigma_i\sigma_j$ is the 
dimensionless correlation matrix. The components $r_i \equiv R_{0i}$ 
clearly correspond to the correlations we discussed earlier, 
except that there is now one $r$ for each clan, a vector of 
correlations $\mathbf{r}$. These are the numbers we are interested 
in, although all we can measure directly are the remaining 
degrees of freedom in $\mathbf{R}$, the galaxy-galaxy correlation 
matrix $\mathbf{Q}$ (the $n_c \times n_c$ submatrix in the lower right corner). 
What constraints can we place? Consider first the simpler 
case involving only $n_c = 2$ clans. Then 

$$\mathbf{R} = \begin{pmatrix} 1 & \mathbf{r}^t \\ \mathbf{r} & \mathbf{Q} \end{pmatrix} = \begin{pmatrix} 1 & r_1 & r_2 \\ r_1 & 1 & \rho \\ r_2 & \rho & 1 \end{pmatrix} \qquad (8)$$

where $\rho$ denotes the one number that we can measure: the 
correlation between clans 1 and 2. Since a correlation 
matrix cannot have negative eigenvalues, $\det\mathbf{R} \ge 0$. After 
some algebra, this gives the constraint 



as a compromise. We make the choice $\mathbf{F} = \mathbf{I}$, which 
corresponds to measuring the rms fluctuations in the 
cells. Computing the window function $W$, defined by 
$\langle \hat{G}_{ij} \rangle = \int P_{ij}(\mathbf{k})\,W(\mathbf{k})\,d^3k$, we find that $\hat{G}_{ij}$ mainly probes 
scales around $\lambda = 2\pi/k \sim 10\,h^{-1}$ Mpc. 

Converting $\mathbf{G}$ back to $\mathbf{Q}$, we calculate error bars for 
its elements and all quantities derived below by generating 
$10^4$ Monte Carlo realizations of Poisson shot noise for 
$N_\alpha^{(i)}$, adding this to the observed values and processing the 
result in exactly the same manner as the real data. Since 
our primary goal is to rule out $r_i = 1$, we clearly do not 
care about sample variance in $\mathbf{Q}$: if biasing were simple, 
one would measure $r_i = 1$ in all samples. When reading 
Tables 1 and 2, the reader should thus bear in mind that 
the error bars reflect shot noise only. 
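The conversion from a covariance matrix to the corresponding correlation matrix, $Q_{ij} = G_{ij}/(G_{ii}G_{jj})^{1/2}$, can be sketched as follows. This covers only the normalization step, not the pair-weighted estimator or the Monte Carlo error bars, and the two-clan covariance is a hypothetical example.

```python
import math

def correlation_matrix(G):
    """Convert a covariance matrix G into Q_ij = G_ij / sqrt(G_ii * G_jj)."""
    n = len(G)
    sigma = [math.sqrt(G[i][i]) for i in range(n)]   # rms fluctuation per clan
    return [[G[i][j] / (sigma[i] * sigma[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical covariance for two clans, for illustration only.
G_toy = [[4.0, 1.2],
         [1.2, 1.0]]
Q_toy = correlation_matrix(G_toy)
print(Q_toy)
```

By construction the diagonal of the result is unity, so only the off-diagonal elements carry information about how well the clans trace each other.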

The results are shown in Table 1. The 2nd column 
confirms that the clustering strength drops with clan 
number, as shown by Bromley et al. (1998a). All off-diagonal 
elements in the correlation matrix are seen to be 
significantly below unity, ranging from the 88% correlation 
between clans 2 and 3 to a mere 42% correlation between 
the earliest and latest galaxy types. Correlations generally 
decrease with separation in clan number, as expected. 
Applying equations (9) or (10) to this table gives a plethora 
of constraints on $r_i$. For example, equation (10) tells us 
that either $r_1$ or $r_4$ must be below $[(1 + 0.42)/2]^{1/2} \approx 84\%$. 
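The same bound can be evaluated for every pair of clans. The sketch below applies the two-clan inequality $\min\{r_i, r_j\} \le [(1+\rho)/2]^{1/2}$ to the central values of the off-diagonal elements of Table 1, ignoring their error bars; the variable and function names are illustrative.

```python
import math
from itertools import combinations

# Off-diagonal elements of Q from Table 1 (central values only).
Q = {(1, 2): 0.63, (1, 3): 0.44, (1, 4): 0.42,
     (2, 3): 0.88, (2, 4): 0.81, (3, 4): 0.76}

def min_r_bound(rho):
    """Upper bound on the smaller of r_i and r_j for pair correlation rho."""
    return math.sqrt((1.0 + rho) / 2.0)

# One bound per distinct pair of the four clans.
bounds = {pair: min_r_bound(Q[pair]) for pair in combinations(range(1, 5), 2)}
for (i, j), bound in sorted(bounds.items()):
    print(f"min(r_{i}, r_{j}) <= {bound:.2f}")
```

The weakest pair correlation (clans 1 and 4) gives the tightest bound, while the strongly correlated clans 2 and 3 leave both of their $r$ values almost unconstrained by this argument.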



$$|r_1 - \rho\,r_2| \le \left[(1-\rho^2)(1-r_2^2)\right]^{1/2} \qquad (9)$$
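The algebra leading to equation (9) can be sketched by expanding the determinant of the 3x3 correlation matrix of equation (8), with $\rho$ the measured correlation between clans 1 and 2, and completing the square in $r_1$:

```latex
\begin{aligned}
\det\mathbf{R} &= 1 + 2\rho\,r_1 r_2 - r_1^2 - r_2^2 - \rho^2 \ge 0 \\
\Longleftrightarrow\quad
(r_1 - \rho\,r_2)^2 &\le 1 - r_2^2 - \rho^2 + \rho^2 r_2^2 = (1-\rho^2)(1-r_2^2).
\end{aligned}
```

Taking the square root of both sides gives the stated constraint.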



Thus the larger we make $r_2$, the more tightly constrained 
$r_1$ becomes, and vice versa. If $r_2 = 1$, then the right hand 
side vanishes, forcing $r_1 = \rho$. This gives an upper bound 
on the smaller of $r_1$ and $r_2$. The most weakly constrained 
case is clearly $r_1 = r_2$, so we obtain 



3.3. The matter as a principal component 

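Let $\lambda_1 \ge \lambda_2 \ge ... \ge \lambda_{n_c}$ be the eigenvalues of $\mathbf{G}$, sorted 
from largest to smallest, with the corresponding unit 
eigenvectors ($\mathbf{G}\mathbf{e}_j = \lambda_j\mathbf{e}_j$, $\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}$). It is instructive to 
decompose the fluctuation vector $\mathbf{x}$ into its uncorrelated 
principal components $y_j \equiv \mathbf{e}_j \cdot \mathbf{x}$: 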



$$\min\{r_1, r_2\} \le \left[\frac{1+\rho}{2}\right]^{1/2} \qquad (10)$$
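For completeness, the step from the two-clan constraint to equation (10) can be written out. Setting $r_1 = r_2 = r$ in $|r_1 - \rho\,r_2| \le [(1-\rho^2)(1-r_2^2)]^{1/2}$ and assuming $0 \le r$ and $|\rho| < 1$:

```latex
\begin{aligned}
r^2 (1-\rho)^2 &\le (1-\rho)(1+\rho)(1-r^2) \\
r^2 (1-\rho) &\le (1+\rho)(1-r^2) \\
2 r^2 &\le 1+\rho
\quad\Longrightarrow\quad r \le \left[\frac{1+\rho}{2}\right]^{1/2}.
\end{aligned}
```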



For the case with $n_c = 4$ clans, we get analogous 
inequalities for each of the $n_c(n_c - 1)/2 = 6$ pairs of clans. 
Additional more complicated constraints follow from requiring 
a nonnegative determinant for all 4 x 4-submatrices and 
for the whole 5x5 matrix, although these will not be 
considered here. 

Table 1. Correlations and relative bias for the four galaxy clans. 

Clan   Relative bias   Galaxy correlation matrix Q
1      2.97±.04        1         .63±.02   .44±.02   .42±.03
2      1.28±.01        .63±.02   1         .88±.02   .81±.02
3      1.21±.01        .44±.02   .88±.02   1         .76±.03
4      1.00±.01        .42±.03   .81±.02   .76±.03   1



$$\mathbf{x} = \sum_{i=1}^{n_c} \mathbf{e}_i\,y_i \qquad (11)$$



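where $\langle y_i y_j \rangle = \delta_{ij}\lambda_i$. If biasing were simple and linear, 
we would have $\mathbf{x} = \mathbf{b}\,\delta$ for some vector $\mathbf{b}$ of bias values 
$b_i$. Equation (11) would therefore have only one term, 
with $\mathbf{e}_1 \propto \mathbf{b}$ and $y_1 \propto \delta$. Additional terms in 
equation (11) would correspond to random physical processes, 
uncorrelated with $\delta$, that are pushing $r_j$ below unity. If 
we want a model with a minimal amount of stochasticity, 
we should therefore interpret the first (largest) principal 
component as tracing the underlying matter distribution, 
i.e., assume that $\delta = a\,y_1 = a\,\mathbf{e}_1^t\mathbf{x}$ for some constant $a$. 
Since $\langle \mathbf{x}\mathbf{x}^t \rangle = \mathbf{G}$, this assumption gives 
$\langle \mathbf{x}\delta \rangle = a\langle \mathbf{x}\mathbf{x}^t \rangle\mathbf{e}_1 = a\mathbf{G}\mathbf{e}_1 = a\lambda_1\mathbf{e}_1$, 
$\langle \delta^2 \rangle = a^2\mathbf{e}_1^t\langle \mathbf{x}\mathbf{x}^t \rangle\mathbf{e}_1 = a^2\mathbf{e}_1^t\mathbf{G}\mathbf{e}_1 = a^2\lambda_1$, 
and hence the correlation coefficients 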



3.2. Measuring the galaxy correlation matrix 

$Q_{ij} \equiv G_{ij}/(G_{ii}G_{jj})^{1/2}$, where $\mathbf{G}$ is the $n_c \times n_c$ 
galaxy-galaxy covariance matrix (the bottom right submatrix 
of $\mathbf{P}$). Based on the analysis of Tegmark et al. (1998, 
hereafter T98), we compute an estimate $\hat{\mathbf{G}}$ of the form 
$\hat{G}_{ij} = \mathbf{g}^{(i)t}\mathbf{E}\,\mathbf{g}^{(j)}$ minus shot noise, removed as in 
Appendix B of T98 by simply omitting self-pairs. The pair 
weighting is given by a matrix $\mathbf{E} = \mathbf{N}^{-1}\mathbf{F}\mathbf{N}^{-1}$, where $\mathbf{N}$ is 
some typical shot noise covariance matrix - since it is 
different for each clan, we use the average $\mathbf{N} \equiv n_c^{-1}\sum_i \mathbf{N}^{(i)}$ 



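$$r_j = \frac{\langle x_j\delta \rangle}{\left[\langle x_j^2 \rangle\langle \delta^2 \rangle\right]^{1/2}} = \sqrt{\frac{\lambda_1}{G_{jj}}}\,(\mathbf{e}_1)_j \qquad (12)$$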



These coefficients are shown in the last row of Table 2. 
The fraction of the variance of clan $j$ that is caused by 
matter fluctuations is 
$\langle (\mathbf{e}_1)_j^2\,y_1^2 \rangle / \langle x_j^2 \rangle = (\mathbf{e}_1)_j^2\,\lambda_1/G_{jj} = r_j^2$, 
simply the square of the correlation coefficient, so this is 
a useful way to interpret $r_j$. 
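As a consistency check, the clan correlations can be recomputed from the rounded eigenvalues and eigenvector components listed in Table 2. The sketch below rebuilds the diagonal of $\mathbf{G}$ from its spectral decomposition, $G_{jj} = \sum_k \lambda_k (\mathbf{e}_k)_j^2$, and then applies $r_j = (\lambda_1/G_{jj})^{1/2}(\mathbf{e}_1)_j$. Because the inputs are rounded table values, agreement is expected only to about two digits, and the function name is illustrative.

```python
import math

# Central values from Table 2: eigenvalues and eigenvector components.
lam = [0.564, 0.122, 0.016, 0.008]
e = [[0.91, 0.31, 0.23, 0.18],
     [-0.41, 0.51, 0.60, 0.46],
     [0.01, -0.08, -0.56, 0.82],
     [-0.08, 0.80, -0.52, -0.28]]

def clan_correlations(lam, e):
    """Return r_j assuming the first principal component traces the matter."""
    n = len(lam)
    # Diagonal of G rebuilt from the spectral decomposition of G.
    G_diag = [sum(lam[k] * e[k][j] ** 2 for k in range(n)) for j in range(n)]
    return [math.sqrt(lam[0] / G_diag[j]) * e[0][j] for j in range(n)]

r = clan_correlations(lam, e)
variance_fraction = [rj ** 2 for rj in r]   # share of variance due to matter
print([round(rj, 2) for rj in r])
```

The squared values give the fraction of each clan's variance attributable to the first principal component under this interpretation.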

Table 2 is indeed consistent with the hypothesis that 
the first principal component traces the matter: since all 
of its four components (boldface) have the same sign, we 
can interpret them as the (relative) bias factors that the 
clans would have if there were no randomness 
($\lambda_2 = \lambda_3 = \lambda_4 = 0$). Since $(\mathbf{e}_1)_j \propto G_{jj}^{1/2}\,r_j \propto b_j r_j$, we can also 
interpret them as the best fit slopes in linear regressions of $x_j$ 
against $\delta$, the quantity that Dekel & Lahav (1998) call 
simply "b". The sharp decrease of $(\mathbf{e}_1)_j$ towards later clan 
types (their ratios are 5.1 : 1.7 : 1.3 : 1) is thus caused 
by the joint decrease of both the measured relative bias $b_j$ 
(the 2nd column of Table 1) and the correlation $r_j$. 

Table 2. Principal components decomposition of the fluctuations. 

i    λ_i           Components of eigenvector e_i
1    .564±.013      .91±.01    .31±.01    .23±.01    .18±.01
2    .122±.004     -.41±.01    .51±.01    .60±.01    .46±.02
3    .016±.002      .01±.02   -.08±.13   -.56±.16    .82±.15
4    .008±.001     -.08±.01    .80±.03   -.52±.10   -.28±.14
Clan correlation r_j    .98±.002   .77±.02    .60±.02    .57±.03



4. CONCLUSIONS 

Our basic conclusion is that bias is complicated. Rather 
than being describable by a single constant $b_i$ for the $i$th 
galaxy type, evidence is mounting that bias is 

1. stochastic and/or nonlinear, requiring a 2nd quantity 
$r_i$ to characterize 2nd moments, 

2. scale dependent ($b_i$ and $r_i$ depend on $k$), 

3. time-dependent ($b_i$ and $r_i$ depend on redshift). 

Complication (1) was predicted by Dekel & Lahav (1998), 
and we have observationally confirmed it here. Complica- 
tion (2) has long been observed on small scales (see e.g. 
Mann et al. 1996, B98 and references therein), while Com- 
plication (3) was predicted by Fry (1996) and TP98 and is 
gathering support from both simulations (Katz et al. 1998; 
B98) and observations (Giavalisco et al. 1998). 

However, this is not cause for despair. The scale 
dependence is predicted to abate on large scales (Scherrer & 
Weinberg 1998), and the time-evolution due to gravity is 
calculable (TP98). As long as we limit ourselves to second 
moments (power spectra etc.), stochasticity merely 
augments $P_i(k)$ and $b_i(k)$ with an additional function $r_i(k)$. 
Furthermore, our constraints on $r_i$ using galaxy data alone, 
without knowledge of the underlying mass distribution, 
suggest strong regularities: the more similar two 
morphological types are, the more strongly they are correlated. 

More strikingly, our tentative identification of the 1st 
principal component of the galaxy covariance matrix as 
the underlying matter distribution is in excellent 
agreement with the recent simulations of B98: galaxies with 
high bias are almost perfectly correlated with matter (we 
find $r_1 \approx 98\%$), whereas the less biased populations have 
weaker correlations (we find $r \approx 60\% - 80\%$). Thus matter 
clustering explains only $r^2 \approx 40\% - 60\%$ of the variance 
of these galaxies. 

This analysis is merely a first step towards observation- 
ally measuring the parameters of stochastic biasing. It re- 
mains to explore the scale-dependence and time-evolution 
of r as well as probing higher-order moments of the joint 
distribution of galaxy types (e.g., Lahav & Saslaw 1992). 
However, our success in constraining r in the absence of 
difficult mass measurements is an encouraging indication 
for the analyses of upcoming galaxy surveys: With the 
techniques presented here, along with theoretical and nu- 
merical modeling, it may be possible to understand bias 
well enough to realize the full potential of these surveys 
for measuring fundamental cosmological parameters. 